Abstract

Epigenetic genome marking and chromatin regulation are central to establishing tissue-specific gene expression 
programs, and hence to several biological processes. Until recently, the only known epigenetic mark on DNA in 
mammals was 5-methylcytosine, established and propagated by DNA methyltransferases and generally associated 
with gene repression. All of a sudden, a host of new actors — novel cytosine modifications and the ten eleven trans- 
location (TET) enzymes — has appeared on the scene, sparking great interest. The challenge is now to uncover the 
roles they play and how they relate to DNA demethylation. Knowledge is accumulating at a frantic pace, linking 
these new players to essential biological processes (e.g. cell pluripotency and development) and also to cancerogen- 
esis. Here, we review the recent progress in this exciting field, highlighting the TET enzymes as epigenetic 
DNA modifiers, their physiological roles, and their functions in health and disease. We also discuss the need to find 
relevant TET interactants and the newly discovered TET-O-linked N-acetylglucosamine transferase (OGT) pathway. 

Keywords: epigenetics; DNA methylation; hydroxymethylation; TET proteins; OGT



INTRODUCTION 

DNA methylation in mammals involves the covalent
addition of a methyl group, most commonly to the
5-position of cytosine in a CpG dinucleotide. CpG
methylation is essential to normal development, 
probably because of its importance in transposable 
element silencing, X chromosome inactivation, gen- 
omic imprinting and the regulation of tissue-specific 
gene expression [1]. The genomic distribution of 
DNA methylation is gene and tissue specific, and
while the number of methylated CpG sequences is 
significant, there exist regions of high CpG density, 
termed CpG islands (CGI), that can be refractory to 
methylation when associated with active gene pro- 
moters [2, 3]. Waves of change affect the global 
DNA methylation pattern at specific times during 
early development, while more subtle, localized 
changes (demethylation or methylation) may occur 
in somatic cells in response to specific signals or sti- 
muli [4]. DNA methylation patterns are established 
early in the zygote by the de novo DNA
methyltransferases 3A and 3B (DNMT3A/3B) and
they are conserved during cell divisions by the maintenance
DNA methyltransferase DNMT1. These
enzymes transfer the methyl group from S-adenosylmethionine
to cytosine (usually in a CpG context),
producing 5-methylcytosine (5-mC) [5].

A longstanding mystery in the epigenetic field 
surrounds the mechanisms allowing transition from 
the methylated to the unmethylated state. DNA 
demethylation can occur both passively and actively. 
'Passive' DNA demethylation means progressive di- 
lution of methylcytosine through mitosis. In this 
case, DNMT1 is excluded from the replication 
fork, so that the neosynthesized strand is kept 
unmethylated. 'Active' DNA demethylation stands
for the rapid, replication-independent, enzymatic removal
of methylcytosine. Such 5-mC 'erasers' have
been intensively sought but have long remained elu- 
sive [6, 7]. As the methyl group of 5-mC is thermo- 
dynamically very stable and thought not to be 
directly removed from cytosine, one of the many 
possible mechanisms for active DNA demethylation
would be 'cutting' of the glycosyl bond between the
deoxyribose and the pyrimidine base. However, this would
require either selective targeting of glycosylases to 
the genomic regions that have to be demethylated 
or the presence of further modifications on the 
methylcytosine to be removed. 
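
As a rough illustration of the 'passive' route described above, the following Python sketch computes how quickly a methylation mark would be diluted by replication. It is an idealized model and is not taken from the studies cited here: the function name and the `maintenance_efficiency` parameter are hypothetical, every newly synthesized strand is assumed to start unmethylated, and the maintenance enzyme is assumed to copy the mark onto it with a fixed probability.

```python
def methylated_strand_fraction(rounds: int, maintenance_efficiency: float = 0.0) -> float:
    """Fraction of DNA strands still carrying 5-mC at one CpG site.

    Idealized assumptions (not from the source text):
    - the site starts fully methylated on both strands (fraction 1.0);
    - at each replication, the parental strand keeps its mark and the new
      strand is synthesized unmethylated;
    - maintenance methylation copies the mark onto the new strand with
      probability `maintenance_efficiency` (0.0 = DNMT1 fully excluded).
    """
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must lie in [0, 1]")
    # After one round the methylated fraction f becomes f * (1 + e) / 2,
    # so after n rounds it is ((1 + e) / 2) ** n.
    per_round = (1.0 + maintenance_efficiency) / 2.0
    return per_round ** rounds
```

Under these assumptions, full exclusion of the maintenance enzyme halves the mark at each division, so about 3% of strands remain methylated after five rounds, whereas perfect maintenance never dilutes it.
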

In 1952, Wyatt and Cohen [8] made an important 
discovery: they found 5-hydroxymethylcytosine 
(5-hmC) in the DNA of the T-even bacteriophages 
where C is almost completely replaced by 5-hmC 
and where 70% of 5-hmC is further glycosylated. 
These modifications have probably arisen from the 
'battle' between bacterial restriction enzymes and 
bacteriophages [9]. For nearly 60 years, researchers 
studying the role of 5-hmC focused solely on bac- 
teria and bacteriophages, until 2009, when two 
groups described this so-called 'sixth base' in mam- 
malian genomic DNA. A first report described the 
presence of 5-hmC in Purkinje neurons and granule 
cells of the mouse brain. The other study showed 
that the Ten Eleven Translocation 1 enzyme (TET1) 
(discussed later) can convert 5-mC to 5-hmC, both 
in cultured cells and in vitro, and that mouse embry- 
onic stem cells (mESCs) contain a significant fraction 
of 5-hmC (0.032% of all bases) [10, 11]. These two 
seminal studies opened new avenues in epigenetic 
research and raised many essential questions regard- 
ing the roles of this new epigenetic modification, not 
only in brain development and pluripotency but also 
in transcriptional regulation, DNA demethylation 
and cancer. 

In this review, we first introduce the TET pro- 
teins, their enzymatic activities and their roles in 
DNA demethylation. We then present what is 
known about TETs and 5-hmC genomic localiza- 
tion, notably from epigenomic studies on ESCs and 
on brain. We further examine the role of TET pro- 
teins in transcriptional regulation, as regards both 
their catalytic activity and their potential 
'non-catalytic' functions. Finally, we provide an 
update on the involvement of deregulated hmC 
and TET levels in cancerogenesis. 



THE TET FAMILY OF ENZYMES 
CATALYZES CONVERSION OF 
5-METHYLCYTOSINE TO
MULTIPLE MODIFIED CYTOSINES 

TET proteins were initially discovered through their
involvement in myeloid leukemia, where the TET1
gene, located on chromosome 10, can translocate
with the H3K4 histone methyltransferase MLL gene 
on chromosome 11 [12]. TET enzymes are members 
of the TET/J-binding protein (JBP) family of
α-ketoglutarate- and iron(II)-dependent dioxygenases,
closely related to the JBP1 and JBP2 proteins
found in kinetoplastids such as trypanosomes and
leishmanias. In mammals, the TET/JBP family is
composed of the founding member TET1 along
with TET2 and TET3. These three genes encode
proteins sharing a double-stranded β-helix fold and
a cysteine-rich region within the catalytic domain.
Although TET1 and TET3 harbor an N-terminal
CXXC DNA-binding domain, TET2 seems to 
have lost it during evolution. Interestingly, TET2 
CXXC exists as a separate gene, also called IDAX 
or CXXC4, which encodes an inhibitor of Wnt sig- 
naling. This suggests a connection between the Wnt 
pathway and the TET proteins [11, 13, 14]. As the
thymine hydroxylase activities of JBP1 and JBP2 are
involved in producing base J (β-D-glucosyl-hydroxymethyluracil)
in kinetoplastids, several labs
have investigated the possibility that TET1 might
catalyze the conversion of 5-mC to 5-hmC.
Furthermore, as overexpression of wild-type TET1 but not
the mutant counterpart was found to decrease methy- 
lation in transfected cells, Anjana Rao's group 
suggested that TETs might potentiate DNA 
demethylation and that 5-hmC might be an inter- 
mediate in the pathway to unmodified cytosine 
[11]. This 'methylcytosine hydroxylase' activity was 
extended to Tet2 and Tet3 by Yi Zhang's lab, which 
also found that Tet1 knock-down impairs mESC
self-renewal and maintenance [15]. In fungi such 
as Neurospora crassa and Aspergillus nidulans, thymine- 
7-hydroxylase catalyzes sequential conversion of 
thymine to 5-hydroxymethyluracil (5-hmU), 
5-formyluracil (5-fU) and 5-carboxyuracil (5-caU) 
[16, 17]. This encouraged researchers to look for fur- 
ther oxidized forms of methylcytosine. In a seminal 
paper, Ito et al. [18] demonstrated that all three Tets
can oxidize 5-hydroxymethylcytosine iteratively to 
5-formylcytosine (5-fC) and 5-carboxycytosine 
(5-caC), and that these 'adducts' are physiologically 
present in several tissues including mESCs. TET-mediated
5-caC synthesis was further confirmed by
He et al. [19], who additionally found that thymine DNA
glycosylase (TDG) removes 5-caC from DNA. This
suggests that TETs and the base excision repair (BER) 
machinery might work together to actively remove 
DNA methylation. 
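
The stepwise pathway described above can be summarized as a small lookup. The Python sketch below is only a bookkeeping aid under simplifying assumptions: the names `TET_OXIDATION`, `TDG_SUBSTRATES`, `tet_oxidize` and `tdg_ber` are hypothetical, each call applies whole oxidation steps with no kinetics, and TDG-initiated repair is assumed always to restore an unmodified cytosine. The text above reports TDG removal of 5-caC [19]; 5-fC is included as a TDG substrate following the legend of Figure 1B.

```python
# Successive TET-catalyzed oxidation products of 5-methylcytosine.
TET_OXIDATION = {"5-mC": "5-hmC", "5-hmC": "5-fC", "5-fC": "5-caC"}

# Bases assumed here to be excised by TDG and replaced through BER.
TDG_SUBSTRATES = frozenset({"5-fC", "5-caC"})


def tet_oxidize(base: str, steps: int = 1) -> str:
    """Apply up to `steps` TET oxidation steps; stop at 5-caC or at a non-substrate."""
    for _ in range(steps):
        if base not in TET_OXIDATION:
            break  # unmodified C and 5-caC are not oxidized further in this model
        base = TET_OXIDATION[base]
    return base


def tdg_ber(base: str) -> str:
    """Return 'C' if the base is excised by TDG and repaired, else the base unchanged."""
    return "C" if base in TDG_SUBSTRATES else base
```

For example, `tdg_ber(tet_oxidize("5-mC", 3))` walks a methylated cytosine through all three oxidation steps and returns `"C"`, whereas `tdg_ber("5-hmC")` leaves the base untouched in this simplified picture.
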
TET PROTEINS AND CYTOSINE
MODIFICATIONS: DIFFERENT 
LEVELS IN DIFFERENT CELLS 

Although the proportion of 5-methylcytosine in 
genomic DNA seems constant (3-4% of total cyto- 
sines), 5-hydroxymethylcytosine varies much more 
between tissues. The brain and spinal cord appear
particularly rich in 5-hmC (~0.80%
and ~0.45%, respectively). Other organs such as the testes, spleen
and thymus show very little 5-hmC (<0.10%). ESCs
and organs such as the kidneys, heart and lungs display an
intermediate level of this epigenetic modification 
[15, 20, 21]. Cell cultures show a very small 
amount of 5-hmC, and unlike 5-mC, this mark 
gradually decreases upon culturing of normal breast 
tissue for example [22]. It is noteworthy that 5-fC 
and 5-caC, although much less abundant than 
5-hmC, are both detectable in mESCs, and that 
5-fC is also present in tissues such as brain, spleen, 
liver and pancreatic tissues [18]. TET expression also 
varies between cells/organs: while TET2 and TET3 
are expressed in various tissues, only ESCs appear 
rich in TET1 [18, 23]. Surprisingly, the TET2 and 
TET3 expression profiles are often similar, suggesting 
that these enzymes act in concert [18, 22]. Another 
point worth stressing is the absence of correlation 
between TET expression and 5-hmC abundance 
[22], a fact worthy of future study. In line with 
observations on cultured cells, the level of 5-hmC 
is significantly reduced in several sarcomas, notably of 
the lung, breast, colon, liver, brain and prostate. 
These reductions might mirror, at least partially, 
the global hypomethylation phenotype usually 
found in various cancers [24-26]. Mitochondrial
DNA is also rich in both hydroxymethylcytosine 
and methylcytosine, and although no mitochondrial 
targeting sequence has been found in TET proteins, 
western blots of mitochondrial extracts show the 
presence of Tet1 and Tet2 [27-29]. It thus seems
that in mammals, oxidized 5-methylcytosine deriva- 
tives are present at various levels according to the cell 
type, and that in some cells at least, they are likely to 
have important functions. 

DNA DEMETHYLATION AND THE 
TET PROTEINS 

The occurrence of 5-hmC, 5-fC and 5-caC in mammalian
tissues led scientists to investigate what
role(s) they might play in 5-mC demethylation.
Although focal/local DNA demethylation is a
mechanism enabling genes to respond to various stimuli,
global DNA demethylation is crucial to erase
epigenetic memory and epimutations during devel- 
opment. At this time, the phenomenon appears to 
give a 'facelift' to DNA, enabling it to evolve along 
with its new environment. This section deals with 
what is known or suspected about the roles of the 
TET proteins and the intermediates of 5-mC oxida- 
tion in global and focal DNA demethylation. 

Global DNA demethylation 

In mammals, there are two waves of global DNA 
demethylation: one occurring when the unipotent 
primordial germ cells (PGCs) migrate to the future 
gonads and the other during fertilization. At the time 
of PGC migration, the paternal and maternal genomes
undergo an active genome-wide demethylation
by little-known mechanisms, although a
recent publication showed that this phenomenon
occurs by conversion of methylcytosine to 5-hmC,
probably by Tet1 and Tet2, and a subsequent
replication-dependent dilution of the 5-hmC mark
[30]. De novo methylation patterns and genomic imprints
are established by DNMT3 during spermatogenesis
at embryonic day 16 (ED 16) and in oocytes
after birth. The second wave of global demethylation
occurs in the zygote soon after fertilization. In this 
case, two mechanisms must be considered: active 
demethylation of the paternal pronucleus just after 
fertilization and delayed passive demethylation of the 
maternal genome. Finally, methylation profiles are 
reestablished around the time of fetal implantation. 
It is worth mentioning that the genomic imprints 
created during PGC formation are not affected by 
the demethylation/remethylation events occurring 
in the zygote [4, 31]. Although the exclusion of 
Dnmt1o (the oocyte version of somatic Dnmt1)
explains how maternal genomic DNA becomes 
demethylated through multiple rounds of replica- 
tion, the paternal pronucleus demethylation system 
has been explained only very recently [32-37]. In 
2011, Iqbal et al. [32] showed that the paternal 
genome is hydroxymethylated upon entry of the 
sperm into the oocyte, while the maternal counter- 
part shows no hydroxymethylation and remains 
methylated. Tet3 is expressed in the mouse oocyte 
and upregulated upon fertilization. Importantly, the 
5-hmC modification remains during the first cell 
divisions, suggesting that it is not removed. Also in
2011, two other reports showed that
Tet3 silencing can impede paternal pronucleus
hydroxymethylation, causing a hypermethylation
phenotype. Interestingly, one group discovered that
the Pgc7 protein (also called Dppa3/Stella) maintains
methylation in the maternal genome, since Pgc7
knockout renders maternal 5-mC accessible to oxidation
to 5-hmC [33, 34]. In 2012, Nakamura et al.
[37] revealed that both the maternal pronucleus and 
the paternal imprinted genes are protected from 
Tet3-mediated hydroxymethylation/demethylation
by binding of Pgc7 to the repressive H3K9me2 his- 
tone mark. Finally, it has recently been observed that 
in the paternal pronucleus, 5-hmC is further oxi- 
dized to 5-fC and 5-caC and that these modifications 
are diluted through multiple cell divisions [35]. 
A simplified model would thus be as follows: upon 
fertilization, the paternal pronucleus is actively 
demethylated through iterative oxidations (5-mC
→ 5-hmC → 5-fC → 5-caC) mediated by maternal
Tet3, and both the imprinted genes and the maternal 
genome are kept methylated by Pgc7-H3K9me2 
binding. We wish to point out that despite the 
active loss of the methyl group, the successive inter- 
mediates of 5-mC oxidation are progressively diluted 
with mitosis. This remark also extends to
global demethylation in PGCs, where 5-hmC decreases
with replication. Thus, the term 'active' DNA
demethylation of the paternal genome/PGCs
should only refer to the 'active' loss of the methyl
group per se, keeping in mind that the carbon-carbon
bond is passively lost during replication.

Focal DNA demethylation 

Focal DNA demethylation events occur primarily 
upon external stimulation of the cell. Until the dis- 
covery of the TET enzymes and of the oxidized 
forms of 5-mC, the only known ways a gene 
could be demethylated in mammals were through 
replication (passive demethylation) or via DNA 
repair (active demethylation) [9]. There was one 
report nearly 15 years ago suggesting that MBD2b, 
a variant of MBD2, can remove the methyl group of 
5-mC by breaking the covalent carbon-carbon bond
[38], but the results described could not be con- 
firmed by others. DNA repair includes BER and 
nucleotide excision repair (NER), i.e. the removal, 
respectively, of a few nucleotides or of a long stretch 
of DNA from a damaged site. BER and NER are 
both essential to maintain the integrity of the 
genome, and their alteration may result in diseases 
such as neurological syndromes, growth defects, or 
UV sensitivity and skin cancer [39]. The choice of
the pathway used to repair DNA depends on the
type of lesion: the BER machinery is recruited to 
single strand breaks or damaged bases and the 
NER components act to remove pyrimidine 
dimers and other bulky adducts. Such damage can
have various causes including UV radiation, oxida- 
tive stress and chemical agents [39]. The growth 
arrest and DNA damage protein 45 (GADD45) 
seems to play a pivotal role in active DNA demethy- 
lation, stabilizing the NER and BER components 
and/or targeting them to methylated cytosines. 
GADD45 can act in concert with the XPA, XPC, 
XPG and XPF proteins to remove methylated 
nucleotides in a NER-dependent fashion [40-42].
GADD45 has also been identified in complexes 
with AID and TDG or AID and MBD4. In these 
latter cases, 5-mC is first de-aminated by AID into 
thymine, introducing a T:G mismatch that is later 
resolved by the glycosylase activity of TDG or 
MBD4 and by the BER components (Figure 1A) 
[43, 44]. 

TET-mediated oxidation of 5-mC can also lead
to local active DNA demethylation (Figure 1B).
Some results suggest that Tet1 promotes demethylation
of the pluripotency gene Nanog in mESCs: upon
Tet1 knockdown, a significant decrease in Nanog
expression was observed, accompanied by pluripotency
impairment [15]. Although further studies support
this view, especially by demonstrating that
Nanog is a direct target of Tet1 [45, 46], other reports
contradict it [47-52]. Ficz et al. [49] and Williams
et al. [51] did not observe a deregulation of Nanog
upon Tet1 depletion, and Koh et al. [50] found
Tet1 and Tet2 knockdown neither to affect Nanog
expression nor to induce mESC differentiation.
Most importantly, Dawlaty et al. [47] showed with
a mouse knockout that Tet1 was not required for
maintaining pluripotency, and that its loss did not affect
embryonic or postnatal development. These discrepancies
might be attributable to differences in
mESC background or culture conditions, to
RNAi off-target effects, or to gene compensation, and
therefore it will be important to further study the
role of Tet1 in embryonic development [14].
Another example of TET-mediated DNA
demethylation was shown in 2012 by Thomson
et al. [53] in mice treated with phenobarbital (PB),
a known non-genotoxic carcinogen inducing liver
tumors. Upon treatment, a set of 30 genes becomes
upregulated in hepatocytes, and this correlates with 
an increase in 5-hmC and a decrease in 5-mC. 

Figure 1: DNA demethylation and the TET proteins. (A) Passive (upper green arrow) and active (lower red arrows)
DNA demethylation networks independent of the TET enzymes. 5-Methylcytosine can be passively lost upon exclusion
of DNMT1 at the time of DNA replication. Active demethylation occurs via the GADD45-mediated BER and
NER pathways. The base excision repair (BER) mechanism involves deamination of 5-mC to thymine by AID;
then the T:G mismatch is recognized by the MBD4 or TDG glycosylase, which removes the base, exposing an AP
(apurinic/apyrimidinic) site resolved by the BER machinery. The nucleotide excision repair (NER) pathway involves
direct removal of the 5-mC nucleotide by XP enzymes (XPA/XPC/XPG or XPF) and other NER components. (B)
Passive (right green arrows) and active (left red arrows) DNA demethylation networks mediated by the TET
enzymes. 5-Methylcytosine can be oxidized by TETs to 5-hmC, 5-fC and 5-caC. DNMT1 shows a decreased affinity
for 5-hmC-containing and, possibly, 5-fC- and 5-caC-containing DNA, leading to progressive dilution of these modified
nucleotides. 5-caC may also be directly decarboxylated to cytosine by an as-yet-unknown enzyme (referred to here as X).
AID/APOBEC-mediated deamination of 5-hmC produces 5-hmU, which can be recognized and removed by 
the SMUG1 glycosylase and the BER machinery. Finally, the TDG enzyme can react with 5-fC and 5-caC and
mediate BER-dependent DNA demethylation. 



Among the upregulated genes, Cyp2b10 (which is
sometimes deregulated in liver tumors) displays a dy- 
namic change in methylation/hydroxymethylation 
pattern: on day 1 after PB exposure, 5-hmC is al- 
ready detected throughout the gene body and up- 
stream from the transcription start site, whereas there 
is no significant change in DNA methylation. On 
day 7, the gene begins to show a decrease in 
5-mC, but there is no substantial change in 
5-hmC. On day 91, the gene becomes fully 
unmethylated, exhibiting no residual 5-hmC. This 
experiment shows that Cyp2b10 transits progressively
from a methylated to a fully unmethylated state 
through 5-mC oxidation. A last example of 
TET-mediated demethylation was demonstrated by 
Wang et al., who found that in mouse brain, TET1
overexpression demethylates the Bdnf and Fgf1 promoters
in a BER-dependent manner, causing
increased expression of the corresponding genes.
Interestingly, the same results are found with AID
overexpression. This suggests a possible link between
a deaminase activity and 5-hydroxymethylcytosine: 
the deamination product of 5-hmC would yield 
5-hmU, a DNA modification found upon overex- 
pression of TET1 or AID/APOBEC deaminase 
family members and which can be removed by the 
SMUG1 glycosylase [54]. It is worth mentioning
that AID was initially discovered in B-lymphocytes 
undergoing antigen-mediated class switch recombin- 
ation (CSR). In these cells, AID is recruited to the 
immunoglobulin heavy-chain (IgH) locus and 
induces double strand breaks by deamination of 
cytosines to uracils. These lesions are then resolved 
by the non-homologous end-joining pathway, 
allowing IgH recombination and antibody isotype 
switching [55]. Although this has not been fully 
investigated, some studies have revealed methylation 
of the IgH locus, showing that demethylation paral- 
lels CSR [56, 57]. It is tempting to speculate that 
methylation of the IgH locus might protect it from 




Delatte and Fuks 



AID-mediated deamination, and that upon cell acti- 
vation, the TET enzymes might hydroxylate 5-mC, 
producing 5-hmC to be recognized by AID and dea- 
minated to 5-hmU. Subsequent removal of 5-hmU 
by a DNA glycosylase would lead to double strand 
breaks and CSR. Further investigations should be 
conducted to ascertain whether the TET enzymes 
act on the antibody-mediated immune response. 

All in all, it thus seems that there is no consensus
'path' to DNA demethylation: it can be passive,
through exclusion of DNMT1, or it can be active,
either dependently or independently of the TET
enzymes (Figure 1A and B).

GENOME-WIDE MAPPING OF 
TET1 AND OXIDIZED 
METHYLCYTOSINES: NOT 
THAT SIMPLE 

To understand accurately the functions of 5-hmC, 
5-hmU, 5-fC and 5-caC, it is important to develop 
systems to map these modifications. Until 2009, 
the gold standard method for analyzing 5-mC was 
bisulfite DNA sequencing, but after Anjana Rao's group
discovered that TET1 mediates 5-hmC formation,
they further found that 5-hmC is also protected
from chemical deamination by bisulfite treatment, 
rendering this technique unable to distinguish 
between 5-mC and 5-hmC [58]. The discovery of 
5-fC and 5-caC has further complicated the situ- 
ation, as they are both interpreted as unmodified 
cytosines after bisulfite treatment [59]. Biologists 
and chemists have thus united their efforts to develop 
a plethora of techniques to map modified cytosines, 
all of which rely on DNA deep sequencing ('next- 
generation' sequencing). Although these methods 
are reviewed elsewhere [59], we present here some 
information on 5-hmC and 5-fC patterns in mESCs 
and the mouse brain and how they relate to Tetl 
genomic-binding profiles. 
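
The ambiguity of conventional bisulfite sequencing described above can be made concrete with a small lookup. The Python sketch below is illustrative only: the table and function names are hypothetical, the readouts for 5-hmC, 5-fC and 5-caC follow the statements in this section, and the readout of unmodified cytosine as T (after conversion to uracil and amplification) is the standard behavior of the method rather than a claim made in the text.

```python
from typing import Dict, List

# Base called at a cytosine position after conventional bisulfite
# conversion and sequencing ("C" = protected, "T" = deaminated).
BISULFITE_READOUT: Dict[str, str] = {
    "C": "T",      # unmodified cytosine is deaminated and read as T
    "5-mC": "C",   # protected from deamination
    "5-hmC": "C",  # also protected, hence indistinguishable from 5-mC
    "5-fC": "T",   # read like unmodified cytosine
    "5-caC": "T",  # read like unmodified cytosine
}


def indistinguishable_groups() -> Dict[str, List[str]]:
    """Group cytosine forms by the base they yield after bisulfite sequencing."""
    groups: Dict[str, List[str]] = {}
    for base, read in BISULFITE_READOUT.items():
        groups.setdefault(read, []).append(base)
    return groups
```

Calling `indistinguishable_groups()` collects 5-mC with 5-hmC under a `C` readout and unmodified C with 5-fC and 5-caC under a `T` readout, which is why additional chemistry or enzymatic steps are needed to tell the five forms apart.
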

Several groups have reported 5-hmC mapping by 
deep sequencing in mESCs and showed (Figure 2A 
and B) that gene-rich regions are particularly abun- 
dant in hydroxymethylcytosine, mostly at unmethy- 
lated CGI promoters and in the gene bodies of 
(highly) transcribed genes, especially in exons where 
methylation is also found. Both 5-hmC and 5-mC are 
also present within enhancers and repetitive elements 
such as LINE1 repeats (Figure 2A and B) [49, 51, 52, 
60-62]. It seems, however, difficult to reach a consensus,
as recent base-resolution mapping (TAB-seq)
in human and mouse ESCs identified 5-hmC
mostly at distal regulatory elements and in low-CpG-content
sequences, but not in promoters or gene-body
CGIs [62]. These discrepancies could arise
from the techniques used to map 5-hmC or the cell
types used in the experiments. Further high-resolution
maps will help us to uncover the precise and quantitative
location of hydroxymethylcytosine, which is central
to understanding its function. It is worth mentioning
that 5-hmC can also decorate non-CpG (CpH)
dinucleotides. CpA methylation is thought to play a
regulatory role in pluripotency, as it disappears when 
ESCs differentiate and is restored upon reprogram- 
ming to obtain induced pluripotent stem cells (iPSCs) 
[63]. Chromatin immunoprecipitation sequencing
(ChIP-seq) of mouse Tet1 has revealed preferential
binding of this protein to unmethylated CGI promoters
and, to a lesser extent, to exons of poised bivalent
(H3K4me3- and H3K27me3-marked) and
active (H3K4me3-marked) genes [46, 51, 52].
Interestingly, some hydroxymethylated regions do
not appear to be bound by Tet1, suggesting either that
they are technically not identified by
ChIP-seq or that another Tet (Tet2 for example)
regulates these specific territories [51]. In 2012,
Raiber et al. [64] described for the first time formylcytosine
mapping in mESCs. In line with 5-hmC
reports, 5-fC was found in DNA repeats, CGI promoters
and gene bodies (mostly in exons) of transcribed
H3K4me3-marked genes, paralleling Tet1
binding. These authors also found 5-fC to be regulated
by TDG glycosylase, as silencing of the latter
leads to 5-fC accumulation (Figure 2A). Finally,
5-fC-containing genes seem to be more expressed
than 5-hmC-containing genes [64]. Interestingly, it 
would seem that some promoters rich in 5-hmC and 
5-fC display no 5-mC, and that some exons show 
localized 5-mC, 5-hmC and 5-fC enrichment [49, 
51, 64]. As Tet1 binds to both CGI promoters and
exons [46, 51, 52], this suggests that a certain 'event'
stabilizes 5-mC in gene bodies, especially in exons. It
is conceivable that the rate of the TET1-catalyzed 5-mC
→ 5-hmC reaction depends on its location, while
the 5-hmC → 5-fC rate does not. If so, the first reaction
might be faster in CGI promoters than in gene bodies.
This would explain why CGI promoters appear
unmethylated, whereas 5-mC plays a regulatory
role in exons. There is emerging evidence suggesting
that 5-mC acts in mRNA splicing: several studies have
shown that exons more frequently contain 5-mC 
than do introns, and that nucleosomes exhibit 

Figure 2: (A) mESC epigenetic landscape of an active gene. Tet1 can bind to chromatin at high-CpG-content CGI
proximal promoters, where it can maintain an unmethylated state through iterative 5-mC oxidation in concert
with TDG glycosylase (note that TDG can also be localised in exons containing CGI). Tet1 and 5-hmC are also
found in active regulatory regions such as enhancers marked with the H3K4me1 and H3K27ac histone modifications.
Exons are richer than introns in various forms of modified cytosines (5-mC, 5-hmC, 5-fC, and potentially 5-caC),
and these might bind specific readers cooperating with the RNAPII and/or mRNA processing machinery (see light
blue arrow with a question mark). (B) mESC epigenetic landscape of an inactive gene (with bivalent promoter).
Tet1 binding to the CGI proximal promoter might be associated with indirect recruitment (potentially via DNA
hydroxymethylation) of the PRC2-containing Ezh2 complex and also with direct interaction with the Sin3a repres- 
sive complex. Sin3a and Ezh2 would subsequently maintain histone H3 in a deacetylated and K27 trimethylated 
form, thus preventing transcriptional activation. 



preferential positioning at exon-intron boundaries
[65-68]. Shukla et al. [65] discovered that binding of
the CTCF factor to exons flanked by weak splice sites
can promote their retention by slowing down
RNA polymerase II (RNAPII). They further revealed
that inhibition of CTCF binding by DNA
methylation can lead to reciprocal effects on exon
inclusion. One might expect 5-mC oxidation intermediates
also to be involved in RNA processing.
Accordingly, Khare et al. [68] have shown with
tiling arrays that in the mouse and human brain,
exon-intron boundaries are enriched with 5-hmC
and that constitutive exons show a distinct 5-hmC 
pattern different from that of alternatively spliced 
exons. Moreover, in vitro studies indicate that the 
rate of RNAPII elongation is reduced in DNA con- 
taining the further oxidized forms 5-fC and 5-caC. 
These marks might thus, like CTCF, affect the choice 
of the retained exon [69]. It will be most interesting 
to learn more about the role of TETs and the 5-mC 
oxidation status in mRNA splicing as well as the 
factors regulating TET kinetics [70, 71]. 

In 2011, a group produced deep-sequencing
maps of hydroxymethylcytosine in the mouse brain
(cerebellum and hippocampus). These results are
comparable to those obtained for mESCs, except
that 5-hmC seems absent from the promoters of
neuronal tissue. To date, there are no maps of Tet
binding that might link these 5-hmC patterns to the
hydroxymethylation machinery [72, 73].

TETS IN TRANSCRIPTIONAL 
REGULATION: ARE THEY MORE 
VERSATILE THAN ANTICIPATED? 

In addition to their role in DNA demethylation, the 
TET proteins may regulate expression independently 
of their enzymatic activity. This view stems from 
studies by Yi Zhang's and Kristian Helin's labs [46,
51]. The first group showed that in mESCs, Tet1
can bind to H3K4me3- and H3K27me3-enriched
promoters and also recruit the Ezh2 Polycomb
group protein (H3K27me3 'writer') (Figure 2B).
Although no direct interaction between Ezh2 and
Tet1 was detected, Ezh2 binding was found to decrease
upon Tet1 knockdown [46]. The second
group described the first interactant of TET proteins,
showing that the Sin3a histone deacetylase repressive
complex associates with Tet1. More importantly,
genome-wide Sin3a profiles revealed that this protein
occupies a significant fraction of Tet1-bound
regions and that, upon silencing of Tet1 or Sin3a,
a subset of targets to which they co-bind is upregulated
[51]. It thus seems that on repressed promoters
(mostly bivalent in mESCs), Tet1 might
recruit the Sin3a complex (directly) and the Ezh2
complex (indirectly), enabling these complexes in
turn to maintain histone H3 in a deacetylated and
K27-trimethylated form, thereby contributing to gene
repression. Interestingly, the presence of Sin3a is
reported as an epigenetic feature linked to alternative 
mRNA splicing. This strengthens the view that 
TETs may play a role in RNA processing [74]. 
Future studies will be needed to identify additional 
TET partners, notably partners of TET2 and TET3. 
These are still early days, but we can expect novel 
interactors to be discovered in the near future, and 
this should shed light on the mode(s) of action of the 
TET proteins. In this context, it is worth mentioning
some proteomics studies [75] that have identified
strong binding between TET2, TET3 and the O-linked
N-acetylglucosamine transferase (OGT),
a glycosyltransferase that adds regulatory O-GlcNAc
moieties to numerous proteins [76]. ChIP-seq studies
have shown that TET2, TET3 and OGT co-localize
on chromatin, at least in part at active promoters rich
in H3K4me3. Furthermore, TET2 and TET3 promote
OGT activity, which in turn stabilizes the
Set1/COMPASS complex, responsible for bulk deposition
of H3K4me3 (Figure 3) [75]. Consistent
with these observations, ChIP-seq experiments in
bone marrow from Tet2 knockout mice show a
strong decrease of O-GlcNAc and H3K4me3 deposition
at Tet2-bound genes, including some key
hematopoietic and epigenetic regulators [75 and
Deplus, Delatte & Fuks, unpublished data]. Recent 
findings by Chen et al. [77] support the TET-OGT
link in mESCs, where they found that TET2 interacts
with and recruits OGT to chromatin. OGT in
turn glycosylates histone H2B on serine 112
(H2BS112GlcNAc), a modification known to mediate
H2B K120 ubiquitinylation (H2BK120ub), a prerequisite
for trimethylation of H3K4 by Set1/COMPASS
[78, 79]. These results thus suggest a novel means
by which TETs might induce transcriptional
activation through H3K4me3: (1) OGT-mediated
glycosylation of H2BS112 and subsequent
ubiquitinylation of H2BK120, and (2) OGT-mediated
Set1/COMPASS stabilization, binding of
Set1/COMPASS to H2BK120ub and trimethylation of
H3K4 (Figure 3). The TET proteins thus appear to
affect gene expression through a dual mode of action,
dependent on or independent of DNA methylation status.
Such a dual mode of transcriptional regulation has been
proposed by previous studies [14]. One mode
might thus act by establishing an H3K4me3 active
chromatin landscape through OGT and
Set1/COMPASS, and the other by generating an
H3K27me3 inactive state of chromatin via, for
example, Sin3a and Ezh2.

[Figure 3 schematic: CGI (proximal promoter); legend: K4-methyl, S112-GlcNAc, K120-Ub]

Figure 3: Proposed TETs-OGT-Set1/COMPASS functional link. (1) TET2 and TET3 (but also potentially TET1)
interact with the OGT enzyme at CGI promoters and enhance its GlcNAc catalytic activity. TET-mediated recruitment
of OGT potentiates H2B glycosylation on serine 112 (H2BS112GlcNAc), which in turn favors H2B ubiquitinylation on
lysine 120 (H2BK120ub). (2) Then, OGT glycosylates the Set1/COMPASS complex on its HCF1 subunit, enhancing
the stability of the complex. This would result in Set1/COMPASS binding to H2BK120ub, trimethylation of H3K4 and
transcriptional activation.

As mentioned in the earlier section, TETs might
influence DNA methylation in more than one way.
Although 5-mC is thought to be rapidly erased by
oxidation at CGI promoters in mESCs, the 5-hmC
and 5-fC intermediates seem quite stable. This is
supported by the findings of Doege et al. [48],
showing that in iPSCs generated from mouse
embryonic fibroblasts by overexpression of Oct4, Sox2,
Klf4 and c-Myc, Tet2 is recruited to pluripotency
genes such as Nanog and Esrrb. Strikingly, although
most of the Nanog methylation is lost upon iPSC
formation, the 5-hmC modification persists and
correlates with changes in histone marks: a decrease in
H3K27me3 and an increase in H3K4me3. The
authors suggest that 5-hmC is an epigenetic
modification per se, distinct from 5-mC. In a manner similar
to MBD protein binding to 5-mC, there might exist
5-hmC-specific readers that translate this mark by
recruiting other transcription/epigenetic factors to
the chromatin. There is evidence in favor of this
model: in vitro, it has been shown that UHRF1/NP95,
an essential factor in maintenance DNA
methylation, binds hemi- or fully hydroxymethylated
DNA through its SRA binding pocket [80].
Studies on mESCs have revealed that Mbd3/NurD
can bind 5-hmC DNA via the MBD domain, that
the genome-wide profile of Mbd3 is similar to that
of Tet1, and that upon Tet1 knockdown, Mbd3
enrichment is decreased on Tet1/Mbd3 targets.
This suggests a possible link between 5-hmC and
the Mbd3/NurD gene repression machinery [81].
Finally, a recent paper from Mellen et al. [82]
showed that MeCP2 is able to bind 5-mC and
5-hmC with high affinity in the nervous system
and that the Rett syndrome-associated mutation
R133C preferentially inhibits 5-hmC binding.
These exciting results highlight the potentially
important roles of the TET proteins in neurological
disorders.

The TET enzymes thus appear to have functions
beyond DNA demethylation, involving the recruitment
of proteins acting on histone modifications and
thereby on gene expression. Given the importance
of the chromatin landscape in health and disease [83], 
it is important to conduct further large-scale studies 
to explore the unappreciated roles of TETs by iden- 
tifying novel interactants and specific readers of 
5-hmC, 5-hmU, 5-fC and 5-caC. 

TETS AND HYDROXYMETHYLATION: ARE
THESE HALLMARKS OF CANCERS?

Various cancers and transformed cells harbor
disturbed epigenomes, as illustrated by bulk DNA
hypomethylation and focal promoter hypermethylation
[84]. Cultured cell lines and tumors also display
a drastic decrease in 5-hmC, possibly related to the
global loss of 5-mC [24-26]. In melanoma cells, Lian
et al. [85] showed by genome-wide analysis a
considerable reduction in 5-hmC marking, correlating with
disease progression. They further found, in addition
to a lesser-than-expected decrease in 5-mC in
intergenic regions, that more than 2000 genes having lost
hydroxymethylcytosine were more methylated in
melanoma cells than in normal melanocytes. Gene
ontology analysis revealed an involvement of these
genes in melanoma progression and in various cancer
pathways. It is noteworthy that the TET enzymes
were not mutated in the melanoma cells; rather,
their expression, especially that of TET2, was significantly
reduced. Isocitrate dehydrogenase 2 (IDH2), which
converts isocitrate to α-ketoglutarate in the Krebs
cycle, also showed markedly decreased expression.
As this enzyme produces a cofactor important for
TET-mediated 5-mC hydroxylation, IDH2
downregulation might result in a loss of the remaining
TET activity, causing global hypohydroxymethylation
and hypermethylation of genes involved in
melanoma progression. Alongside the generally accepted
association between sarcomas and mutations in genes 
encoding key differentiation/proliferation regulators 
or involved in signaling (e.g. Ras-MAPK, STAT and 
PI3K), there is increasing evidence of an association 
with mutations affecting epigenetic factors and of a 
link between cell transformation and an altered chro- 
matin landscape [86]. For instance, myeloid malig- 
nancies such as myeloproliferative neoplasms 
(MPNs), myelodysplastic syndromes (MDS) and 
acute myeloid leukemia (AML) appear associated 
with mutations in the following genes: MLL and
EZH2 (encoding 'writers' of the H3K4me3 and
H3K27me3 marks, respectively), ASXL1 (encoding a protein
thought to direct the EZH2 repressive complex to
chromatin) and DNMT3A (likely explaining the
global hypomethylation phenotype of myeloid
malignancies). Interestingly, TET2 and IDH1/IDH2,
both involved in DNA demethylation [86],
have also been found to be mutated in these
pathologies [86]. More precisely, somatic deletions and
TET2-inactivating mutations are found in 4-13%
of MPNs and 20-25% of MDSs, with no clear
prognostic value [87, 88], and TET2 mutations are found in
7-23% of AMLs, where they correlate with a poor
prognosis [89]. To gain insights into the role of
TET2 in leukemogenesis, Tet2-knockout mice
have been generated by different laboratories.
Although there are some differences between the
mouse strains (possibly due to the use of different
knockout methods), the loss of Tet2 appears to
increase the hematopoietic stem cell compartment
and to skew cell differentiation towards the myeloid
compartment, causing symptoms resembling those
associated with TET2 mutations [90-92]. One
might expect TET2 loss of function to link tumor
suppressor gene hypermethylation with an aberrant
hydroxymethylation pattern, but results are quite
conflicting: while Figueroa et al. [93] found that
AML patients with TET2 mutations display
decreased hydroxymethylation and increased DNA
methylation, Ko et al. [94] found that TET2 loss of
function is predominantly associated with reduced
methylation at differentially methylated CpG sites.
These differences in methylation pattern might be
due to technical issues or to the type of disease
(AML in the first case, MPNs/MDS in the second).
TET2 ChIP-seq and genome-wide profiling of
5-mC and 5-hmC on human MPN, MDS and
AML samples will help to shed light on the role of
TET2 in myeloid malignancies. IDH1 and IDH2
loss-of-function mutations not only decrease
α-ketoglutarate synthesis but also lead to accumulation
of 2-hydroxyglutarate (2-HG), an oncometabolite
found in the plasma of cancer patients carrying
IDH mutations [95, 96]. 2-HG, by competing with
α-ketoglutarate, can inhibit various cellular
dioxygenases, notably the TET enzymes [93, 97]. IDH1/IDH2
mutations are found in 2.5-5% of MPNs,
3.5% of MDSs and 15-33% of AMLs, and these
mutations and TET2 mutations appear to be mutually
exclusive [88, 96, 98-101]. In keeping with the view
that TET and IDH proteins act in concert, AMLs
with IDH mutations display a global and specific
hypermethylation signature, potentially due to
decreased TET activity [93].

Although decreased 5-hmC seems to be a 
common feature of various cancers, the link with 
TET expression/mutation levels and aberrant 
methylation is not as straightforward as one might 
expect. Systematic establishment of cancer-associated 
genome-wide maps of methylation, hydro- 
xymethylation and TETs binding will help scientists 
to assess the roles these events and enzymes play in 
cancer progression. The clinical value of using 
specific TET inhibitors to restore proper hydroxy- 
methylation patterns in cancers where these are 
altered remains to be explored. 

CONCLUSIONS 

The discovery of the TET proteins in 2009 sparked 
feverish interest: for many scientists, here at last was 
the DNA demethylase activity they had been seeking 
for so long [10, 11]. Many examples of TET-mediated
demethylation have been observed, from
paternal pronucleus reprogramming upon
fertilization [32-37] to demethylation of cancer-related
genes in PB-exposed mice [53] and of the Bdnf/Fgf1
genes upon neuronal stimulation [54]. Yet, it remains
a challenge to elucidate the exact pathway followed
in each specific context (iterative oxidation,
oxidation + deamination or even passive DNA
demethylation through repelling of DNMT1 [102]). To further
complicate the picture, Schiesser et al. [103] reported
an intriguing 5-caC-decarboxylating activity in
mESCs (Figure 1B). DNA demethylation through
decarboxylation might be less prone to mutations 
caused by DNA lesions and also less energy- 
consuming than demethylation mediated by DNA 
repair, which requires the action of many factors. 
Finding a 5-caC decarboxylase will thus be one of 
the next tasks for many epigenetic laboratories. 
Importantly, the TET proteins seem to be more
than just DNA demethylases and transcription
regulators. They appear to be multitasking proteins involved not
only in pluripotency and embryonic development
[45-52] but also, potentially, in RNA processing
(e.g. splicing) [67-69]. With proteins so essential
and so versatile, it comes as no surprise that their
loss can lead to diseases such as cancer [86]. To fully
understand the link with disease, it should be very
informative to obtain genome-wide profiles of TET binding and
of 5-mC oxidation intermediates for disorders
in which aberrant methylation patterns are observed.
Finally, it seems that oxidized DNA modifications
might be 'epigenetic modifications' per se [48].
Perhaps other modifications are hiding on
nucleotides, waiting to be discovered. Whatever the
case may be, an epic scientific hunt has been
launched. The epigenetic 'block' has room for more
'new kids' and more exciting discoveries. May future
studies help to sketch a more complete picture.